HTML Tags as Extraction Cues for Web Page Description Construction



Similar resources

HTML Tags as Extraction Cues for Web Page Description Construction

Using four previously identified samples of Web pages containing meta-tagged descriptions, the value of meta-tagged keywords, the first 200 characters of the body, and text marked with common HTML tags as extracts helpful for writing summaries was estimated by applying two measures: density of description words and density of two-word description phrases. Generally, titles and keywords showed t...


HTML Page Analysis Based on Visual Cues

In this paper, we present a novel approach to automatically analyzing semantic structure of HTML pages based on detecting visual similarities of content objects on web pages. The approach is developed based on the observation that in most web pages, layout styles of subtitles or records of the same content category are consistent and there are apparent separation boundaries between different ca...


Embedding Secret Data in HTML Web Page

In this paper, we suggest a novel data hiding technique in an HTML Web page. HTML tags are case insensitive, and hence a letter in lowercase and the same letter in uppercase inside an HTML tag are interpreted in the same manner by the browser, i.e., a change of case in a web page is imperceptible to the browser. We basically exploit this redundancy and use it to embed secret data inside a web pag...


On Reducing Dynamic Web Page Construction Times

Many web sites incorporate dynamic web pages to deliver customized contents to their users. However, dynamic pages result in increased user response times due to their construction overheads. In this paper, we consider mechanisms for reducing these overheads by utilizing the excess capacity with which web servers are typically provisioned. Specifically, we present a caching technique that integ...


Automatic Extraction of Generic Web Page Components

Information on the World Wide Web is accessed not just visually, but also automatically by systems, such as search engines and alternative browsers (e.g. screen readers and voice browsers), which extract and present relevant data automatically from Web pages. In most cases extraction cannot be performed directly, since HTML documents of today lack adequate semantic markup. This thesis proposes ...



Journal

Journal title: Informing Science: The International Journal of an Emerging Transdiscipline

Year: 2003

ISSN: 1547-9684,1521-4672

DOI: 10.28945/509